home
***
CD-ROM
|
disk
|
FTP
|
other
***
search
/
Cream of the Crop 20
/
Cream of the Crop 20 (Terry Blount) (1996).iso
/
os2
/
rsynth21.zip
/
readme.os2
< prev
next >
Wrap
Text File
|
1996-07-19
|
6KB
|
139 lines
This is a rough port of Rsynth2.0, an English text-> speech app originally
written for a Sun Sparcstation. Despite what I wrote in earlier versions,
I have made the changes required for this package to both output speech
directly vis MMPM, and to save WAV files as well as AU files. I'm still
interested in hearing from folks who are using this.
Rsynth, with just a little work, can be modified to become a Speech output
library, good for that talking Web Browser the world so desparately needs.
As a matter of fact, there's a print to Rsynth option in my Lynx/2 cfg file
right now. Or maybe the IRC client that speaks......
Using the Executables:
There are three OS/2 executables in this Zip file. They are all written
with EMX/GCC, so you will need the EMXRT stuff to use them. They are:
Say.exe - the actual text->speech app
mkdictdb.exe - Makes the required Dictionary Database file from
an external Dictionary
dlookup.exe - Looks up Phonemes from the database Dictionary.
Running Say.exe:
Typing "Say --help" on the command line will show you all the command line
options. The important ones for OS/2 are as follows:
-r samp rate Valid values for this are 8, 11, 22, and 44. This is the
sample rate used either for generating WAV files or
speaking directly. The default is 8 kHz. The default
value should be used for creating AU files, any valid
value may be used for WAV file creation or speaking
directly.
+Q Quiet mode - if specified on the command line, will not
speak to MMPM.
+C Some Sound Card device drivers play speech from Rsynth
twice as fast as they should, so the output sounds like
a chipmunk. Specifiying +C on the command line will
counter this 'chipmunk effect' by lying to MMPM about
playback sample rate.
-b 16 This will specify 16 bit playback (for sound systems
that can support this). The default is 8 bit.
-o filename This saves a file in AU format. If you have installed
MMOS/2 AU sound support, you can then use PLAY.CMD to
listen to it. AU sound support for MMOS/2 is part of
the Warp BonusPak - you will need to install this if
you have yet to do so.
-w filename This saves a file in WAV format. All WAV files are
stored as 8 bit mono, until I can figure oout how to
do otherwise.
-f frequency Fundamental Frequency for the voice - default is 1330 Hz.
-v Verbose output- print a lot of stuff to the screen while
doing the conversion.
<filename get text to say from file filename, otherwise takes
stuff to say from command line.
Mkdictdb.exe:
dlookup.exe :
I have not included a dictionary or a dictionary database in the distrib-
ution. Say will work without it, but is more accurate in pronunciation
with one. Read the README file for information as to where to find
American English and British English dictionaries and how to convert them
using mkdictdb.exe. I have placed the ASCII version of the source diction-
ary I am using on my web site, http://www.cris.com/~djd/products.html,
for your convenience as well. This will need to be converted by MkDictdb.
These dictionaries are in a fairly simple ASCII format, so it is pretty
easy to modify or add words to them. So, for instance, the word
"LuxuryYacht" could actually be pronounced Throat-Warbler-Mangrove, or you
could do horrible things to the pronunciation of the word "Microsoft".
Or you could actually do something useful with it - your choice.
Be warned that the resulting Dictionary file is fairly large. Say.exe
looks for a dictionary file in the current working directory named Adict.db
by default, the command line switch -d b causes it to look for the
alternate dictionary bdict.db.
You should also be warned that the conversion process can take several
hours, and that MkDictDB is poorly designed in that it has no progress
meter.
Once you've made the dictionary, say.exe will look for it in your
$ETC directory, so it would be useful to move it there.
Notes for Programmers:
Everything here compiles under EMX/GCC, using Gnumake as the make utility.
You will also need to have GNU GDBM ported over to OS/2 - I highly
recommend Kai Uwe Rommel's port, which can currently be found on
ftp.leo.org. Your mileage may vary with other development environments.
There has been at least one person who has sucessfully recompiled Rsynth
with IBM Cset. I tried here (not very hard) with some success, but had
difficulties with the port of GDBM. So, this version is still on EMX.
Stuff that needs to be done:
Floating Point Underflow problem
This has been a real bugbear to fix - I hope I have it fixed now. If you
get floating point underflow exceptions with say.exe Please Email Me!
Creating a Speech Output Library
It should be fairly easy to take functions from say.c and create a speech
output API - speech_init(), Say_string(), say_file(), Speech_close() would
be a minimal function set. Then put 'em in a DLL. Any Takers?
Mods to Improve Speech Output Quality
The support for sample rates other than 8000 helps. As I only have an 8
bit sound card, currently Rsynth creates 8 bit files, and plays 8 bit
data. Raw data created by Rsynth is 16 bit, so adding a switch for 16 bit
support, and updating the audio_init, audio_play, was definately a
worthwhile thing to do - the sound quality at 16 bit is greatly improved.
It would be nice to figure out how to make 16 bit WAV files as well.
Mods to increase efficiency
One person out there has made mods to multithread calculating and playing
sentences in say.exe. This was a great idea - if I wasn't so lazy I
would have incorporated those changes into this release. Maybe next time,
with 16 bit WAV support.....
DIVE support has been as well. <sigh>
Derek J Decker djd@cris.com